I. Overview


We  begin  with  an  outline  of  the  research  effort. 

We  developed  theory  and  methods  for  optimal  digital  data  hiding  in  arbitrary  transform 
domains  of  digital  hosts  (images,  video,  audio).  Our  optimality  criteria  are  mean-square 
host  distortion,  recovery  error  rate,  and  the  Shannon  capacity  of  the  covert  channel. 
Additionally,  we  introduced  for  the  first  time  the  concept  of  multiuser/multi-signature 
steganography. 

Finally,  we  developed  new  counter-measures  to  (optimal  multiuser)  steganography  in  the 
form  of  active  (message  extraction)  and  passive  (stego/non-stego  decision)  steganalysis. 

II.  Research  Breakthrough:  Optimal  Multiuser  Embedding 

The following two steps concisely describe the optimal embedding (data hiding)
procedure that we developed.

• Data preparation: Partition the host into blocks; take the transform of choice of each block
(e.g., DCT, wavelet, or other); choose a subset of the transform-domain coefficients (e.g., all
except the dc coefficient); call the vector of chosen coefficients per block $\mathbf{x}_{L \times 1}$.

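• Embedding with multiple signatures: In each transform-domain block host vector $\mathbf{x}$, hide
$K$ bits, $b_1, \ldots, b_K$, each with corresponding user signature $\mathbf{s}_i$ and embedding
amplitude $A_i$, $i = 1, 2, \ldots, K$; if desired, account for external white Gaussian noise
$\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma_n^2 \mathbf{I}_L)$:

$$\mathbf{y} = \sum_{i=1}^{K} A_i b_i \mathbf{s}_i + \Big(\mathbf{I}_L - \sum_{i=1}^{K} c_i \mathbf{s}_i \mathbf{s}_i^T\Big)\mathbf{x} + \mathbf{n}$$

where $\big(\mathbf{I}_L - \sum_{i=1}^{K} c_i \mathbf{s}_i \mathbf{s}_i^T\big)$ is a projection manipulator of the
host $\mathbf{x}$ parameterized in $c_1, c_2, \ldots, c_K$.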


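The optimal embedding signatures and scalar parameters to be used in the above equation
are tabulated below.

Optimal pairs $(\mathbf{s}_i^{\text{opt}}, c_i^{\text{opt}})$, $i = 1, \ldots, K$:

$\mathbf{s}_i^{\text{opt}} = \mathbf{q}_{L-i+1}$

$c_i^{\text{opt}} = \dfrac{\cdots}{2\lambda_{L-i+1}}$ (numerator illegible in the source)

where $\mathbf{q}_1, \ldots, \mathbf{q}_L$ are the eigenvectors of $\mathbf{R}_x = E\{\mathbf{x}\mathbf{x}^T\}$ with
eigenvalues $\lambda_1 > \lambda_2 > \ldots > \lambda_L$, and
$\mathcal{D}_i = A_i^2 + c_i^2 \mathbf{s}_i^T \mathbf{R}_x \mathbf{s}_i$, $i = 1, \ldots, K$.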


These assignments complete the description of optimal multiuser steganography.
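As an illustration of the two steps of this section, the following is a minimal NumPy sketch and not the implementation used in this work. It assumes the embedding rule $\mathbf{y} = \sum_i A_i b_i \mathbf{s}_i + (\mathbf{I}_L - \sum_i c_i \mathbf{s}_i \mathbf{s}_i^T)\mathbf{x} + \mathbf{n}$, estimates $\mathbf{R}_x$ by a sample average over the host block vectors, and takes $\mathbf{s}_i = \mathbf{q}_{L-i+1}$. The function names, the synthetic host, and the values $A_i = 1$ and $c_i = 1$ are hypothetical placeholders; in particular, $c_i = 1$ is chosen only for illustration and is not the optimal value from the table.

```python
import numpy as np

def min_eigen_signatures(X, K):
    """Return the K eigenvectors of the sample autocorrelation matrix of X
    (columns of X are host block vectors) with the smallest eigenvalues."""
    N = X.shape[1]
    Rx = X @ X.T / N                     # sample estimate of E{x x^T}
    _, eigvecs = np.linalg.eigh(Rx)      # eigh returns ascending eigenvalues
    return eigvecs[:, :K]                # s_1 = q_L, s_2 = q_{L-1}, ...

def embed(X, bits, S, A, c, noise_std=0.0, rng=None):
    """Embed K bits per block: Y = S diag(A) bits + (I - S diag(c) S^T) X + noise."""
    rng = np.random.default_rng() if rng is None else rng
    L = X.shape[0]
    manipulator = np.eye(L) - (S * c) @ S.T   # I_L - sum_i c_i s_i s_i^T
    Y = (S * A) @ bits + manipulator @ X
    if noise_std > 0:
        Y = Y + noise_std * rng.standard_normal(Y.shape)
    return Y

# Hypothetical demo on a synthetic, correlated "host" (not real image data).
rng = np.random.default_rng(0)
L, N, K = 16, 1024, 3
X = rng.standard_normal((L, L)) @ rng.standard_normal((L, N))
bits = rng.choice([-1.0, 1.0], size=(K, N))
S = min_eigen_signatures(X, K)
A = np.ones(K)                            # illustrative amplitudes
c = np.ones(K)                            # illustrative, NOT the optimal c_i
Y = embed(X, bits, S, A, c)
recovered = np.sign(S.T @ Y)              # signature-matched recovery
print(np.array_equal(recovered, bits))
```

With $c_i = 1$ and no noise, the host contribution along each signature is removed entirely, so signature-matched recovery returns the embedded bits; other values of $c_i$ trade host distortion against residual host interference.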


III.  Steganography  Experimental  Studies 

Below, we present an example where the optimal procedure of Section II is applied.
Figure 1 shows the original 256 × 256 gray-scale “Baboon” image.


Figure  1 


Figure 2 shows the same image after optimal embedding of K = 15 messages of 1 Kbit
each, with equal per-message distortion, total distortion of 31.8 dB, and, for the sake of
generality, additive white Gaussian noise of 3 dB.


Figure  2 


Figure 3, below, shows the bit-error-rate (BER) versus host for the 15 messages hidden in
Fig. 2.

Figure 3

In Figure 4, we present the sum-capacity of the covert channel as a function of the total
distortion (K = 15 messages). The presented result is the average over the whole USC-SIPI
image database.


Figure  4 


IV.  Research  Breakthrough:  Active  Multi-signature  Steganalysis 

In the following, we present the steps of the novel multi-signature iterative generalized
least-squares (M-IGLS) procedure that we developed. As demonstrated later on, it enables
effective recovery of messages hidden by conventional spread-spectrum embedding even
when the embedding signatures are completely unknown.

We begin with a careful formulation of the problem.

• For the steganalysis effort, reformulate the unknown multi-signature embedding in matrix form

$$\mathbf{Y} = \mathbf{V}\mathbf{B} + \mathbf{Z}$$

where $\mathbf{Y} \in \mathbb{R}^{L \times M}$ is the compound data/observation matrix,
$\mathbf{B}_{K \times M} = [\mathbf{b}_1, \ldots, \mathbf{b}_K]^T$ is the unknown matrix whose rows are the
$K$ messages of size $M$, $\mathbf{V}_{L \times K} = [\mathbf{v}_1, \ldots, \mathbf{v}_K]$ is the unknown
effective signature set matrix with $\mathbf{v}_i = A_i \mathbf{s}_i$, $i = 1, \ldots, K$, and
$\mathbf{Z}_{L \times M}$ is the disturbance matrix that contains everything else (e.g., manipulated
unknown original host, noise, etc.).

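• Formulate the active steganalysis problem as a joint estimation/detection problem with the
following (generalized) least-squares solution

$$\hat{\mathbf{V}}, \hat{\mathbf{B}} = \arg\min_{\mathbf{B} \in \{\pm 1\}^{K \times M},\; \mathbf{V} \in \mathbb{R}^{L \times K}} \big\| \mathbf{R}_z^{-1/2} (\mathbf{Y} - \mathbf{V}\mathbf{B}) \big\|_F^2$$

where $\mathbf{R}_z = \mathbf{Z}\mathbf{Z}^T / M$.

• Problems: Unfortunately, the solution exhibits complexity exponential in $KM$; $\mathbf{R}_z$ is not
available since the cover image is unknown.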


We can, however, overcome the identified problems effectively by first replacing
$\mathbf{R}_z$ in the optimization formula by $\mathbf{R}_y = \mathbf{Y}\mathbf{Y}^T / M$ and then
following a procedure that was theoretically derived by applying iterative mean-square
principles.
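The steps of the M-IGLS procedure itself are not legible in this copy of the report. The sketch below is therefore an assumption: it alternates a least-squares update of $\mathbf{V}$ for fixed $\mathbf{B}$ with a generalized-least-squares bit decision for fixed $\mathbf{V}$, using $\mathbf{R}_y$ in place of $\mathbf{R}_z$, as a matrix-form analogue of the single-signature IGLS iteration of [1]. The function name `m_igls`, the stopping rule, and the synthetic test data are hypothetical.

```python
import numpy as np

def m_igls(Y, K, max_iter=100, rng=None):
    """Assumed alternating GLS iteration for Y = V B + Z with unknown V and B."""
    rng = np.random.default_rng() if rng is None else rng
    M = Y.shape[1]
    Ry_inv = np.linalg.inv(Y @ Y.T / M)            # R_y replaces the unknown R_z
    B = rng.choice([-1.0, 1.0], size=(K, M))       # arbitrary initialization
    for _ in range(max_iter):
        # Least-squares signature estimate for the current bit matrix.
        V = Y @ B.T @ np.linalg.pinv(B @ B.T)
        # GLS bit decisions for the current signature estimate.
        G = V.T @ Ry_inv
        B_new = np.sign(np.linalg.pinv(G @ V) @ G @ Y)
        B_new[B_new == 0] = 1.0                    # break ties toward +1
        if np.array_equal(B_new, B):               # stop when decisions repeat
            break
        B = B_new
    V = Y @ B.T @ np.linalg.pinv(B @ B.T)          # signatures matching final B
    return V, B

# Hypothetical synthetic data: K = 2 messages of M = 500 bits, L = 8.
rng = np.random.default_rng(1)
L, M, K = 8, 500, 2
V_true = rng.standard_normal((L, K))
B_true = rng.choice([-1.0, 1.0], size=(K, M))
Y = 3.0 * V_true @ B_true + rng.standard_normal((L, M))
V_hat, B_hat = m_igls(Y, K, rng=rng)
print(V_hat.shape, B_hat.shape)
```

Any blind estimate of this kind can recover the messages only up to a permutation of the rows of $\mathbf{B}$ and a sign flip per row, and convergence from a single initialization is not guaranteed.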
V. Steganalysis Experimental Studies


Here, we examine the performance of the active steganalysis procedure of Section IV
when, first, the unknown data hiding was carried out by conventional spread-spectrum
embedding and, next, when embedding was done by the optimal procedure of Section II.

Figure 5 plots the average bit-error-rate versus per-message distortion when K = 4
messages of length 1 Kbit are embedded by conventional spread-spectrum means in the
256 × 256 Baboon image of Figure 1 together with 3 dB additive white Gaussian noise.
The performance of our steganalysis algorithm (black line) shows that the intended
recipient has a (small) advantage only when they use the optimal minimum-mean-square-error
(MMSE) filter for recovery and not just the embedding signature. In practice, our
steganalysis algorithm renders such steganography unusable.


Figure  5 


In Figure 6, we carry out the exact same study but for optimal data hiding as described in
Section II. We observe that from about 19 to 20 dB onward a gap opens up between our
steganalysis BER (black line) and the BER of the intended recipient (blue line), which
creates a window of opportunity for effective steganography.




Figure  6 


VI. Research Breakthrough: Passive Multi-signature Steganalysis


In our terminology, passive steganalysis is the problem of deciding between the presence
and absence of (multi-signature) spread-spectrum hidden messages in a given digital
medium. It is, therefore, a binary hypothesis testing problem. Passive steganalysis is
envisioned as a rapid high-volume scanning technology that identifies suspicious media
and sets them aside for further scrutiny. With this understanding, we set our own passive
steganalysis requirements as follows. Our algorithm must be of low complexity (for rapid
scanning operation), image/medium independent (broad applicability without
modifications), and unsupervised (we should not expect adversaries to supply embedding
examples).

The  developed  algorithmic  procedure  (low-complexity,  medium-independent,  and 
unsupervised)  is  as  follows. 




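STEP 1:

• Run the single-signature steganalysis (IGLS) of [1] $P$ times with arbitrary distinct
initializations $\mathbf{b}_i^{(0)} \in \{\pm 1\}^M$ and obtain decisions $\hat{\mathbf{b}}_i$, $i = 1, \ldots, P$.

• The algorithm is summarized below, where $\mathbf{R}_y = \frac{1}{M} \sum_{m=1}^{M} \mathbf{y}_m \mathbf{y}_m^T$
and $\mathbf{Y}_{L \times M} = \mathbf{V}\mathbf{B} + \mathbf{Z}$.

1) $p = 0$; initialize $\mathbf{b}^{(0)} \in \{\pm 1\}^M$ arbitrarily.

2) $p = p + 1$;
   $\mathbf{v}^{(p)} = \frac{1}{M} \mathbf{Y} \mathbf{b}^{(p-1)}$;
   $\mathbf{b}^{(p)} = \mathrm{sign}\{\mathbf{Y}^T \mathbf{R}_y^{-1} \mathbf{v}^{(p)}\}$.

3) Repeat Step 2 until $\mathbf{b}^{(p)} = \mathbf{b}^{(p-1)}$.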
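The three-step iteration above can be sketched in NumPy as follows. This is an illustrative reading of the steps and not the code of [1]; the function name `igls`, the iteration cap `max_iter`, the tie-breaking rule for zero-valued statistics, and the synthetic data are assumptions.

```python
import numpy as np

def igls(Y, b0, max_iter=100):
    """Single-signature IGLS sketch: alternate v = Y b / M and
    b = sign(Y^T R_y^{-1} v) until the bit decisions stop changing."""
    M = Y.shape[1]
    Ry_inv = np.linalg.inv(Y @ Y.T / M)      # R_y = (1/M) sum_m y_m y_m^T
    b = np.asarray(b0, dtype=float)
    for _ in range(max_iter):
        v = Y @ b / M                        # signature estimate v^(p)
        b_new = np.sign(Y.T @ Ry_inv @ v)    # bit decisions b^(p)
        b_new[b_new == 0] = 1.0              # break ties toward +1 (assumption)
        if np.array_equal(b_new, b):         # Step 3: b^(p) == b^(p-1)
            break
        b = b_new
    return b

# Hypothetical synthetic observation: one hidden message in Gaussian "host".
rng = np.random.default_rng(2)
L, M = 8, 400
s = rng.standard_normal(L)
s /= np.linalg.norm(s)
b_true = rng.choice([-1.0, 1.0], size=M)
Y = 5.0 * np.outer(s, b_true) + rng.standard_normal((L, M))
b_hat = igls(Y, rng.choice([-1.0, 1.0], size=M))
print(abs(b_hat @ b_true) / M)               # |normalized correlation| with truth
```

The returned decision is determined only up to a global sign, which is why the printed agreement with the true message is measured by the magnitude of the normalized correlation.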


[1] M. Gkizeli, D. A. Pados, S. N. Batalama, and M. J. Medley, “Blind iterative recovery of spread-spectrum steganographic
messages,” in Proc. IEEE Intern. Conf. Image Proc., Genova, Italy, Sept. 2005, vol. 2, pp. 11-14.


Then,  correlate  among  obtained  decisions  as  follows. 

STEP  2: 

• Set $\rho_{i,j} = \hat{\mathbf{b}}_i^T \hat{\mathbf{b}}_j / M$, $i, j = 1, \ldots, P$, $i \neq j$, the normalized
cross-correlation between decisions.

The Concept

• If the image is stego, then with high probability

- $\hat{\mathbf{b}}_i$, $i = 1, \ldots, P$, corresponds to one of the $K$ hidden messages;

- there exists a $|\rho_{i,j}|$ close to 1 (i.e., $\hat{\mathbf{b}}_i$, $\hat{\mathbf{b}}_j$ are decisions on the same message).

• If the image is clean

- $\hat{\mathbf{b}}_i$, $i = 1, \ldots, P$, is irrelevant;

- for any $i \neq j$, $|\rho_{i,j}| > 0.5$ has very low probability.

Here is now the complete passive steganalysis algorithm.


Stego = 0; i = 0.

While i < P

    i = i + 1

    Execute the Gkizeli-Pados-Batalama-Medley routine [1] and obtain decision $\hat{\mathbf{b}}_i$.
    If $|\rho_{i,j}| > \gamma$ for any $1 \leq j < i$
        Stego = 1; i = P + 1.
    End

End


• Threshold $\gamma$ is usually chosen in the [0.5, 0.9] range; a larger $\gamma$ induces a lower false
alarm rate $P_{FA}$ but a higher probability of miss $P_M$, and vice versa.

• $P$ = 30 to 200.
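The decision loop above can be sketched in plain Python as follows. The IGLS routine of [1] is abstracted as a caller-supplied function `get_decision(i)` that returns the i-th ±1 decision vector; that interface, the function name `passive_steganalysis`, and the toy decision vectors are hypothetical and serve only to exercise the correlation test.

```python
def passive_steganalysis(get_decision, P=30, gamma=0.7):
    """Return True (stego) as soon as two decisions have |rho| > gamma,
    where rho is their normalized cross-correlation; otherwise False."""
    decisions = []
    for i in range(P):
        b_i = list(get_decision(i))          # i-th run of the IGLS routine
        M = len(b_i)
        for b_j in decisions:                # compare with all earlier runs
            rho = sum(x * y for x, y in zip(b_i, b_j)) / M
            if abs(rho) > gamma:
                return True                  # early exit, as in "i = P + 1"
        decisions.append(b_i)
    return False

# Toy decision vectors (M = 8), mutually orthogonal, so rho = 0 between them.
h0 = [1, 1, 1, 1, 1, 1, 1, 1]
h1 = [1, -1, 1, -1, 1, -1, 1, -1]
h2 = [1, 1, -1, -1, 1, 1, -1, -1]

clean_runs = [h0, h1, h2]                    # no two runs agree
stego_runs = [h0, h1, [-v for v in h1]]      # third run equals -h1, so rho = -1

clean_flag = passive_steganalysis(lambda i: clean_runs[i], P=3, gamma=0.7)
stego_flag = passive_steganalysis(lambda i: stego_runs[i], P=3, gamma=0.7)
print(clean_flag, stego_flag)                # prints: False True
```

Taking the magnitude of the correlation makes the test insensitive to the sign ambiguity of blind decisions, and returning at the first match keeps the average number of IGLS runs low on stego media.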

The algorithm shows exceptional promise against conventional spread-spectrum
steganography, as demonstrated in our experiments. Figure 7(a) and its zoom-in in (b)
show the probability of correct identification versus false alarm on a dataset of about 1,500
images [3], [4] and compare against the recent feature extraction algorithm in [2].


(a)  (b) 


Figure  7 

[2] Y. Wang and P. Moulin, “Optimized feature extraction for learning-based image steganalysis,” IEEE Trans. Inform.
Forensics and Security, vol. 2, pp. 31-45, Mar. 2007.

[3] USC-SIPI Image Database, [Online]. Available: http://sipi.usc.edu/database/database.cgi?volume=misc

[4] UCID - Uncompressed Colour Image Database, [Online]. Available: http://vision.cs.aston.ac.uk/datasets/UCID/ucid.html


VII. Concluding Remarks


Optimal data hiding, as described in this grant report, offers vast improvement in
recovery error rate/Shannon capacity versus distortion and enables highly effective
multi-signature embedding (potentially different hidden messages for different points of
contact along the chain of command, etc.).

Our active steganalysis M-IGLS hidden message extraction algorithm can defeat
conventional spread-spectrum (SS) steganography. However, our optimal embedding scheme
is resistant to M-IGLS steganalysis attacks, especially for small hidden messages.

Our new passive (binary hypothesis testing) steganalysis procedure offers close to 95%
identification success rate at about 1% false alarm when used on hosts with
conventionally spread-spectrum embedded messages. We have not yet tested hosts with
optimally embedded messages.

Our  suggested  plan  for  continued  research  is  as  follows. 

Optimal  steganography:  Study  and  analysis  of  transforms,  host  partitions,  multi-signature 
assignments,  variable-length  signature  optimization. 

Active steganalysis: Research on the effective recovery of relatively small messages
embedded with our own optimal scheme.

Passive steganalysis: Algorithmic tests and modifications against optimal embedding.

Video steganography and steganalysis: Pioneer this new research area with uncompressed
(raw) and compressed video.